Extending Hidden Markov (tree) models for word representations
Authors
Abstract
There is ample research in natural language processing (NLP) on obtaining word representations, including vector space modeling, clustering, and techniques derived from language models. Good word representations are vital for overcoming the lexical sparseness inherent to many NLP problems. Much less studied are approaches that capture wider or global context (see e.g. Nepal and Yates (2014)). We are interested in using syntax for learning semantic classes, since it allows a wider and more relevant context to be incorporated (Šuster & Van Noord, 2014). It has been shown that dependency trees can extend a) Hidden Markov Models (HMMs), such that the resulting word representations improve performance on NLP classification tasks (Grave et al., 2013), and b) Brown clusters, yielding higher similarity scores in a wordnet-based experiment (Šuster & Van Noord, 2014). A drawback of the existing approaches is that trees are exploited only partially: the dependency links determine the structure (the word context), but the labels of those links are not part of the model.
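The tree extension of the HMM described above can be sketched concretely: each node of the dependency tree carries a hidden class, the class of a dependent is conditioned on the class of its head, and each class emits the node's word. The sentence likelihood then follows from an upward (inside) recursion over the tree. The sketch below is a minimal illustration of that idea under these assumptions; it is not the implementation of Grave et al. (2013) or Šuster & Van Noord (2014), and all names and toy parameters are hypothetical.

```python
import numpy as np

def tree_hmm_likelihood(words, children, root, pi, T, E):
    """Likelihood of the observed words on a dependency tree under a
    tree-structured HMM: each node's hidden class depends on its head's
    class (transition matrix T), and each class emits the node's word
    (emission matrix E). Computed with an upward (inside) recursion.
    A minimal sketch, not the cited authors' implementation."""
    def beta(v):
        b = E[:, words[v]].copy()      # emission probability at node v
        for c in children.get(v, []):
            b *= T @ beta(c)           # marginalise over the child's class
        return b
    return float(pi @ beta(root))      # marginalise over the root's class

# Toy tree: root word 0 with two dependents (words 1 and 2),
# two hidden classes, a three-word vocabulary (all values invented).
pi = np.array([0.6, 0.4])                             # class prior at the root
T  = np.array([[0.7, 0.3], [0.2, 0.8]])               # head class -> dependent class
E  = np.array([[0.5, 0.3, 0.2], [0.1, 0.4, 0.5]])     # class -> word
words    = {0: 0, 1: 1, 2: 2}
children = {0: [1, 2]}
lik = tree_hmm_likelihood(words, children, 0, pi, T, E)
```

In a full model this quantity would be the E-step of EM over a corpus of parsed sentences; the abstract's proposal of treating dependency labels as additional observed variables would amount to a second emission term per node.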
Similar resources
Word Representations, Tree Models and Syntactic Functions
Word representations induced from models with discrete latent variables (e.g. HMMs) have been shown to be beneficial in many NLP applications. In this work, we exploit labeled syntactic dependency trees and formalize the induction problem as unsupervised learning of tree-structured hidden Markov models. Syntactic functions are used as additional observed variables in the model, influencing both...
Bayesian Variable Order n-gram Language Model based on Pitman-Yor Processes
This paper proposes a variable order n-gram language model by extending a recently proposed model based on the hierarchical Pitman-Yor processes. Introducing a stochastic process on an infinite depth suffix tree, we can infer the hidden n-gram context from which each word originated. Experiments on standard large corpora showed validity and efficiency of the proposed model. Our architecture is ...
Hidden Markov Tree Model for Word Alignment
We propose a novel unsupervised word alignment model based on the Hidden Markov Tree (HMT) model. Our model assumes that the alignment variables have a tree structure which is isomorphic to the target dependency tree and models the distortion probability based on the source dependency tree, thereby incorporating the syntactic structure from both sides of the parallel sentences. In English-Japan...
Language modelling for efficient beam-search
This paper considers the problems of estimating bigram language models and of efficiently representing them as a finite-state network that can be employed by a hidden Markov model based, beam-search, continuous speech recognizer. A review of the best-known bigram estimation techniques is given, together with a description of the original Stacked model. Language model comparisons in terms of p...
Introducing Busy Customer Portfolio Using Hidden Markov Model
Despite the effective role of Markov models in customer relationship management (CRM), there is no comprehensive literature review covering all related studies. In this paper the focus is on academic databases, to find all the articles published in 2011 and earlier. One hundred articles were identified and reviewed to find direct relevance for applying Markov models...